CSE408
Minimum Spanning Tree (Prim's, Kruskal's)
Lecture #26
Minimum Spanning Tree
A minimum spanning tree connects all nodes
in a given graph with the smallest possible total edge weight
An MST is defined for a connected, undirected
graph
An MST can have weighted edges
Multiple MSTs can exist within a given
undirected graph
More about Multiple MSTs
Multiple MSTs can be generated depending on
which algorithm is used
If you wish, you can have the MST grow from a
specific starting node, as in Prim's algorithm
However, if all edge weights are unique, only
one MST will exist
Real Life Application of a MST
A cable TV company is laying cable in a new
neighborhood. If it is constrained to bury the cable
only along certain paths, then there would be a
graph representing which points are connected by
those paths. Some of those paths might be more
expensive, because they are longer, or require the
cable to be buried deeper; these paths would be
represented by edges with larger weights. A
minimum spanning tree would be the network with
the lowest total cost.
Prim’s Algorithm
Initially discovered in 1930 by Vojtěch Jarník,
then rediscovered in 1957 by Robert C. Prim
Similar to Dijkstra’s Algorithm regarding a
connected graph
Starts off by picking any node within the graph
and growing from there
Prim’s Algorithm Cont.
Label the starting node, A, with 0 and all
others with infinity
Starting from A, update the label of each
connected node with the weight of its
connecting edge, if that is less than the
current label
Find the unvisited node with the smallest label,
visit it, and update its neighbors' labels the
same way
Repeat until all the nodes have been visited
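The labeling steps above can be sketched in Python. This is a minimal sketch, not the slides' exact procedure: the graph is an assumed adjacency-list dict, and a heap of frontier edges stands in for the node labels.

```python
import heapq

def prim_mst(graph, start):
    """Return (total_weight, mst_edges) for a connected undirected
    graph given as {node: [(neighbor, weight), ...]}."""
    visited = {start}
    # Heap of (weight, from_node, to_node) for edges leaving the tree.
    edges = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(edges)
    total, mst = 0, []
    while edges and len(visited) < len(graph):
        w, u, v = heapq.heappop(edges)
        if v in visited:
            continue  # both endpoints already in the tree; discard
        visited.add(v)
        total += w
        mst.append((u, v, w))
        for x, wx in graph[v]:
            if x not in visited:
                heapq.heappush(edges, (wx, v, x))
    return total, mst

# Hypothetical example graph (not from the lecture):
g = {'A': [('B', 2), ('C', 3)],
     'B': [('A', 2), ('C', 1), ('D', 4)],
     'C': [('A', 3), ('B', 1), ('D', 5)],
     'D': [('B', 4), ('C', 5)]}
print(prim_mst(g, 'A'))  # (7, [('A', 'B', 2), ('B', 'C', 1), ('B', 'D', 4)])
```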
Prim’s Algorithm Example (diagrams omitted)
Kruskal’s Algorithm
Created in 1957 by Joseph Kruskal
Finds the MST by taking the smallest weight in the graph and
connecting the two nodes and repeating until all nodes are
connected to just one tree
This is done by creating a priority queue using the weights as
keys
Each node starts off as its own tree
While the queue is not empty: if the edge retrieved
connects two different trees, connect them; if not, discard it
Once the queue is empty, you are left with the minimum
spanning tree
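The steps above can be sketched in Python. A sorted edge list plays the role of the priority queue, and a simple union-find structure tracks which tree each node belongs to (a sketch; node numbering and the example edges are assumptions, not from the slides):

```python
def kruskal_mst(n, edges):
    """edges: list of (weight, u, v) with nodes 0..n-1."""
    parent = list(range(n))           # each node starts off as its own tree

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, mst = 0, []
    for w, u, v in sorted(edges):     # smallest weight first
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                  # endpoints already in one tree: discard
        parent[ru] = rv               # connect the two trees
        total += w
        mst.append((u, v, w))
    return total, mst

# Hypothetical example:
edges = [(2, 0, 1), (1, 1, 2), (4, 1, 3), (3, 0, 2), (5, 2, 3)]
print(kruskal_mst(4, edges))  # (7, [(1, 2, 1), (0, 1, 2), (1, 3, 4)])
```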
Kruskal’s Algorithm Example (step-by-step diagrams omitted)
CSE408
Dijkstra, Huffman Coding
Lecture #27
Dijkstra’s Algorithm
Edge Relaxation
Consider an edge e = (u,z) such that
u is the vertex most recently added to the cloud
z is not in the cloud
The relaxation of edge e updates distance d(z) as follows:
d(z) ← min{d(z), d(u) + weight(e)}
(Diagram omitted: before relaxation d(u) = 50 and d(z) = 75; after relaxing e, d(z) = 60.)
Example
(Diagrams omitted: successive snapshots of Dijkstra’s algorithm on a six-vertex graph A–F; each vertex is labeled with its current distance, which shrinks as edges are relaxed and the cloud grows.)
Dijkstra’s Algorithm
A priority queue stores
the vertices outside the
cloud
Key: distance
Element: vertex
Locator-based methods
insert(k,e) returns a
locator
replaceKey(l,k) changes
the key of an item
We store two labels with
each vertex:
Distance (d(v) label)
locator in priority queue
Algorithm DijkstraDistances(G, s)
  Q ← new heap-based priority queue
  for all v ∈ G.vertices()
    if v = s
      setDistance(v, 0)
    else
      setDistance(v, ∞)
    l ← Q.insert(getDistance(v), v)
    setLocator(v, l)
  while ¬Q.isEmpty()
    u ← Q.removeMin()
    for all e ∈ G.incidentEdges(u)
      { relax edge e }
      z ← G.opposite(u, e)
      r ← getDistance(u) + weight(e)
      if r < getDistance(z)
        setDistance(z, r)
        Q.replaceKey(getLocator(z), r)
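The pseudocode above maps naturally onto Python's heapq. Since heapq has no replaceKey or locators, this sketch pushes a new entry on every relaxation and skips stale entries on removal (lazy deletion); the example graph is an assumption, not from the slides:

```python
import heapq

def dijkstra(graph, s):
    """Distances from s in a graph given as {u: [(v, w), ...]}."""
    dist = {v: float('inf') for v in graph}
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry; a smaller key was pushed later
        for v, w in graph[u]:
            r = d + w              # relax edge (u, v)
            if r < dist[v]:
                dist[v] = r
                heapq.heappush(pq, (r, v))
    return dist

# Hypothetical example:
g = {'A': [('B', 4), ('C', 2)], 'B': [('C', 1), ('D', 5)],
     'C': [('B', 1), ('D', 8)], 'D': []}
print(dijkstra(g, 'A'))  # {'A': 0, 'B': 3, 'C': 2, 'D': 8}
```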
Analysis
Graph operations
Method incidentEdges is called once for each vertex
Label operations
We set/get the distance and locator labels of vertex z O(deg(z)) times
Setting/getting a label takes O(1) time
Priority queue operations
Each vertex is inserted once into and removed once from the priority
queue, where each insertion or removal takes O(log n) time
The key of a vertex in the priority queue is modified at most deg(w) times,
where each key change takes O(log n) time
Dijkstra’s algorithm runs in O((n + m) log n) time provided the graph is
represented by the adjacency list structure
Recall that Σv deg(v) = 2m
The running time can also be expressed as O(m log n) since the graph
is connected
Extension
Using the template
method pattern, we can
extend Dijkstra’s
algorithm to return a
tree of shortest paths
from the start vertex to
all other vertices
We store with each
vertex a third label:
parent edge in the
shortest path tree
In the edge relaxation
step, we update the
parent label
Algorithm DijkstraShortestPathsTree(G, s)
{ as DijkstraDistances, with these additions }
for all v ∈ G.vertices()
  setParent(v, ∅)
for all e ∈ G.incidentEdges(u)
  { relax edge e }
  z ← G.opposite(u, e)
  r ← getDistance(u) + weight(e)
  if r < getDistance(z)
    setDistance(z, r)
    setParent(z, e)
    Q.replaceKey(getLocator(z), r)
Why Dijkstra’s Algorithm Works
Dijkstra’s algorithm is based on the greedy method.
It adds vertices by increasing distance.
(Diagram omitted: the six-vertex example graph with its final distance labels.)
Suppose it didn’t find all shortest
distances. Let F be the first wrong
vertex the algorithm processed.
When the previous node, D, on the
true shortest path was considered, its
distance was correct.
But the edge (D,F) was relaxed at that
time!
Thus, so long as d(F)>d(D), F’s distance
cannot be wrong. That is, there is no
wrong vertex.
Purpose of Huffman Coding
Proposed by Dr. David A. Huffman in 1952
“A Method for the Construction of Minimum-Redundancy Codes”
Applicable to many forms of data
transmission
Our example: text files
The Basic Algorithm
Huffman coding is a form of statistical coding
Not all characters occur with the same
frequency!
Yet all characters are allocated the same amount
of space
1 char = 1 byte, be it e or x
The Basic Algorithm
Any savings in tailoring codes to frequency of
character?
Code word lengths are no longer fixed like
ASCII.
Code word lengths vary and will be shorter for
the more frequently used characters.
The (Real) Basic Algorithm
1. Scan text to be compressed and tally
occurrence of all characters.
2. Sort or prioritize characters based on number of
occurrences in text.
3. Build Huffman code tree based on
prioritized list.
4. Perform a traversal of tree to determine all code words.
5. Scan text again and create new file using the
Huffman codes.
Building a Tree Scan the original text
Consider the following short text:
Eerie eyes seen near lake.
Count up the occurrences of all characters in the
text
Building a Tree Scan the original text
Eerie eyes seen near lake.
What characters are present?
E e r i space
y s n a l k .
Building a Tree Scan the original text
Eerie eyes seen near lake.
What is the frequency of each character in the text?
Char Freq. Char Freq. Char Freq.
E 1 y 1 k 1
e 8 s 2 . 1
r 2 n 2
i 1 a 2
space 4 l 1
Building a Tree Prioritize characters
Create binary tree nodes with character
and frequency of each character
Place nodes in a priority queue
The lower the occurrence, the higher the
priority in the queue
Building a Tree Prioritize characters
Uses binary tree nodes
public class HuffNode
{
    public char myChar;
    public int myFrequency;
    public HuffNode myLeft, myRight;
}
PriorityQueue<HuffNode> myQueue;
Building a Tree
The queue after inserting all nodes
Null Pointers are not shown
E:1  i:1  y:1  l:1  k:1  .:1  r:2  s:2  n:2  a:2  sp:4  e:8
Building a Tree
While priority queue contains two or more nodes
Create new node
Dequeue node and make it left subtree
Dequeue next node and make it right subtree
Frequency of new node equals sum of frequency of left
and right children
Enqueue new node back into queue
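The loop above can be sketched with Python's heapq. This is a sketch, not the slides' Java classes: tuples stand in for HuffNode, and a tie-breaking counter keeps heap entries comparable. Tie-breaking order, and therefore the exact code words, may differ from the slides' tree, but the code lengths are optimal either way.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman tree and return {char: code word}."""
    heap = []
    for i, (ch, freq) in enumerate(Counter(text).items()):
        heap.append((freq, i, ch))        # leaf: (frequency, tiebreak, char)
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:                  # two or more nodes in the queue
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        n += 1
        # New node's frequency is the sum of its children's; enqueue it.
        heapq.heappush(heap, (f1 + f2, n, (left, right)))
    codes = {}
    def walk(node, code):
        if isinstance(node, tuple):       # internal node
            walk(node[0], code + '0')     # going left appends a 0
            walk(node[1], code + '1')     # going right appends a 1
        else:
            codes[node] = code or '0'     # leaf: code word complete
    walk(heap[0][2], '')
    return codes

codes = huffman_codes("Eerie eyes seen near lake.")
print(len(codes['e']))  # 2 -- the most frequent character gets the shortest code
```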
Building a Tree
(Step-by-step diagrams omitted. Each slide showed the queue after one merge: the two lowest-frequency nodes are dequeued and combined under a new parent whose frequency is their sum — first E:1 and i:1 into a node of frequency 2, then y:1 and l:1, then k:1 and .:1, then r:2 and s:2 into 4, then n:2 and a:2 into 4, and so on — and each new node is enqueued again.)
What is happening to the characters with a low number of occurrences? They are combined first, so they end up deeper in the tree with longer code words.
After enqueueing the final combined node there is only one node left in the priority queue.
Building a Tree
Dequeue the single node left in the
queue.
This tree contains the new code
words for each character.
Frequency of root node should
equal number of characters in text.
(Final tree diagram omitted; the root has frequency 26.)
Eerie eyes seen near lake. 26 characters
Encoding the File Traverse Tree for Codes
Perform a traversal of the
tree to obtain new code
words
Going left is a 0; going right is a 1
A code word is complete only when a leaf node is reached
(Tree diagram omitted.)
Encoding the File Traverse Tree for Codes
Char Code
E 0000
i 0001
y 0010
l 0011
k 0100
. 0101
space 011
e 10
r 1100
s 1101
n 1110
a 1111
(Tree diagram omitted.)
Encoding the File
Rescan text and encode file
using new code words
Eerie eyes seen near lake.
Char Code
E 0000
i 0001
y 0010
l 0011
k 0100
. 0101
space 011
e 10
r 1100
s 1101
n 1110
a 1111
000010110000011001110001010110110100
111110101111110001100111111010010010
1
Why is there no need for a
separator character?
Because no code word is a prefix of any other code word.
Have we made things any
better?
73 bits to encode the text
ASCII would take 8 * 26 =
208 bits
If a fixed-length code were used instead, 4 bits per
character would be needed, for a total of
4 * 26 = 104 bits; the savings would not be as great.
Decoding the File
How does receiver know what the codes are?
Tree constructed for each text file.
Considers frequency for each file
Big hit on compression, especially for smaller files
Tree predetermined
based on statistical analysis of text files or file types
Data transmission is bit based versus byte based
Once receiver has tree it
scans incoming bit stream
0 go left
1 go right
(Tree diagram omitted.)
101000110111101111011
11110000110101
Summary
Huffman coding is a technique used to compress files
for transmission
Uses statistical coding
more frequently used symbols have shorter code words
Works well for text and fax transmissions
An application that uses several data structures
CSE408
Single Source Shortest
path
Lecture #28
Outline and Reading
Weighted graphs
Shortest path problem
Shortest path properties
Dijkstra’s algorithm
Algorithm
Edge relaxation
The Bellman-Ford algorithm
All-pairs shortest paths
Weighted Graphs
In a weighted graph, each edge has an associated numerical value,
called the weight of the edge
Edge weights may represent distances, costs, etc.
Example:
In a flight route graph, the weight of an edge represents the distance in
miles between the endpoint airports
(Flight-route graph omitted; vertices are the airports ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL.)
Shortest Path Problem
Given a weighted graph and two vertices u and v, we want to find a
path of minimum total weight between u and v.
Length of a path is the sum of the weights of its edges.
Example:
Shortest path between Providence and Honolulu
Applications
Internet packet routing
Flight reservations
Driving directions
(Diagram omitted: the flight-route graph, highlighting the shortest path between PVD and HNL.)
Shortest Path Properties
Property 1:
A subpath of a shortest path is itself a shortest path
Property 2:
There is a tree of shortest paths from a start vertex to all the other vertices
Example:
Tree of shortest paths from Providence
(Diagram omitted: the tree of shortest paths from PVD in the flight-route graph.)
Why It Doesn’t Work for Negative-Weight Edges
If a node with a negative incident
edge were to be added late to the
cloud, it could mess up distances
for vertices already in the cloud.
(Diagram omitted: an example where a negative-weight edge processed late gives vertex C a wrong final distance.)
Dijkstra’s algorithm is based on the greedy method. It adds vertices by increasing distance.
C’s true distance is 1, but it is already in the cloud with d(C) = 5!
Bellman-Ford Algorithm
Works even with negative-
weight edges
Must assume directed edges
(for otherwise we would have
negative-weight cycles)
Iteration i finds all shortest
paths that use i edges.
Running time: O(nm).
Can be extended to detect a
negative-weight cycle if it
exists
How?
Algorithm BellmanFord(G, s)
  for all v ∈ G.vertices()
    if v = s
      setDistance(v, 0)
    else
      setDistance(v, ∞)
  for i ← 1 to n-1 do
    for each e ∈ G.edges()
      { relax edge e }
      u ← G.origin(e)
      z ← G.opposite(u, e)
      r ← getDistance(u) + weight(e)
      if r < getDistance(z)
        setDistance(z, r)
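The pseudocode above translates directly to Python. The extra pass at the end answers the slide's "How?" question about detecting a negative-weight cycle: if any edge can still be relaxed after n-1 rounds, a reachable negative cycle exists. The example graph is an assumption, not from the slides.

```python
def bellman_ford(n, edges, s):
    """edges: list of directed (u, v, w); vertices 0..n-1.
    Returns distances from s, or None if a reachable
    negative-weight cycle exists."""
    INF = float('inf')
    dist = [INF] * n
    dist[s] = 0
    for _ in range(n - 1):             # iteration i finds shortest paths of <= i edges
        for u, v, w in edges:
            if dist[u] + w < dist[v]:  # relax edge (u, v)
                dist[v] = dist[u] + w
    for u, v, w in edges:              # one more pass: any improvement => negative cycle
        if dist[u] + w < dist[v]:
            return None
    return dist

# Hypothetical example with a negative edge:
edges = [(0, 1, 4), (0, 2, 5), (1, 2, -2), (2, 3, 3)]
print(bellman_ford(4, edges, 0))  # [0, 4, 2, 5]
```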
Bellman-Ford Example
Nodes are labeled with their d(v) values
(Diagrams omitted: successive iterations on a graph with negative-weight edges such as -2; the d(v) labels decrease round by round until all shortest-path distances are found.)
All-Pairs Shortest Paths
Find the distance between
every pair of vertices in a
weighted directed graph G.
We can make n calls to
Dijkstra’s algorithm (if no
negative edges), which
takes O(nm log n) time.
Likewise, n calls to Bellman-
Ford would take O(n²m)
time.
We can achieve O(n³) time
using dynamic
programming (similar to the
Floyd-Warshall algorithm).
Algorithm AllPair(G) {assumes vertices 1,…,n}
  for all vertex pairs (i,j)
    if i = j
      D0[i,i] ← 0
    else if (i,j) is an edge in G
      D0[i,j] ← weight of edge (i,j)
    else
      D0[i,j] ← +∞
  for k ← 1 to n do
    for i ← 1 to n do
      for j ← 1 to n do
        Dk[i,j] ← min{Dk-1[i,j], Dk-1[i,k] + Dk-1[k,j]}
  return Dn
(Diagram omitted: a path from i to j through k; the subpaths i→k and k→j use only vertices numbered 1,…,k-1, while the whole path uses only vertices numbered 1,…,k.)
CSE408
All pairs shortest path
Lecture #29
All pairs shortest path
The problem: find the shortest path between every
pair of vertices of a graph
The graph: may contain negative edges but no
negative cycles
A representation: a weight matrix where
W(i,j) = 0 if i = j,
W(i,j) = ∞ if there is no edge between i and j,
W(i,j) = weight of the edge otherwise
Note: we have shown principle of optimality
applies to shortest path problems
The weight matrix and the graph
      1   2   3   4   5
1  [  0   1   ∞   1   5 ]
2  [  9   0   3   2   ∞ ]
3  [  ∞   ∞   0   4   ∞ ]
4  [  ∞   ∞   2   0   3 ]
5  [  3   ∞   ∞   ∞   0 ]
(Graph diagram omitted: vertices v1–v5 with the directed weighted edges listed in the matrix.)
The subproblems
How can we define the shortest distance di,j in
terms of “smaller” problems?
One way is to restrict the paths to only include
vertices from a restricted subset.
Initially, the subset is empty.
Then, it is incrementally increased until it
includes all the vertices.
The subproblems
Let D(k)[i,j]=weight of a shortest path from
vi to vj using only vertices from {v1,v2,…,vk}
as intermediate vertices in the path
D(0)=W
D(n)=D which is the goal matrix
How do we compute D(k) from D(k-1) ?
The Recursive Definition:
Case 1: A shortest path from vi to vj restricted to using only
vertices from {v1,v2,…,vk} as intermediate vertices does not
use vk. Then D(k)[i,j]= D(k-1)[i,j].
Case 2: A shortest path from vi to vj restricted to using only
vertices from {v1,v2,…,vk} as intermediate vertices does use
vk. Then D(k)[i,j]= D(k-1)[i,k]+ D(k-1)[k,j].
(Diagram omitted: a shortest path from vi to vj through vk; the subpaths vi→vk and vk→vj use intermediate vertices only from {v1, …, vk-1}, while the whole path uses intermediate vertices from {v1, …, vk}.)
The recursive definition
Since
D(k)[i,j]= D(k-1)[i,j] or
D(k)[i,j]= D(k-1)[i,k]+ D(k-1)[k,j].
We conclude:
D(k)[i,j]= min{ D(k-1)[i,j], D(k-1)[i,k]+ D(k-1)[k,j] }.
The pointer array P
Used to enable finding a shortest path
Initially the array contains 0
Each time that a shorter path from i to j is found
the k that provided the minimum is saved (highest
index node on the path from i to j)
To print the intermediate nodes on the shortest
path, a recursive procedure that prints the shortest
paths from i to k, and from k to j, can be used
Floyd's Algorithm Using n+1 D matrices
Floyd  // Computes shortest distance between all pairs of
       // nodes, and saves P to enable finding shortest paths
1. D0 ← W        // initialize D array to W[ ]
2. P ← 0         // initialize P array to [0]
3. for k ← 1 to n
4.   do for i ← 1 to n
5.     do for j ← 1 to n
6.       if (Dk-1[i,j] > Dk-1[i,k] + Dk-1[k,j])
7.         then Dk[i,j] ← Dk-1[i,k] + Dk-1[k,j]
8.              P[i,j] ← k
9.         else Dk[i,j] ← Dk-1[i,j]
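The algorithm above can be sketched in Python. This sketch uses a single distance matrix updated in place (the variant this lecture justifies later) rather than n+1 matrices, and runs on a small 3-vertex example so the result can be checked by hand; the example weights are assumptions, not taken from the slides.

```python
def floyd(W):
    """Floyd's algorithm on an n x n weight matrix W
    (float('inf') where there is no edge).
    Returns (D, P): shortest distances, and P[i][j] = the
    highest-numbered intermediate vertex (1-based, 0 = none)
    saved when a shorter i->j path was found."""
    n = len(W)
    D = [row[:] for row in W]             # work in place on a copy of W
    P = [[0] * n for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
                    P[i][j] = k + 1       # save k (1-based) for path printing
    return D, P

INF = float('inf')
W = [[0, 4, 5],
     [2, 0, INF],
     [INF, -3, 0]]
D, P = floyd(W)
print(D)  # [[0, 2, 5], [2, 0, 7], [-1, -3, 0]]
print(P)  # [[0, 3, 0], [0, 0, 1], [2, 0, 0]]
```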
Example
(Graph diagram omitted: a 3-vertex graph with edges 1→2 of weight 4, 1→3 of weight 5, 2→1 of weight 2, 3→2 of weight -3.)

W = D0 =   1   2   3        P =   1  2  3
      1  [ 0   4   5 ]        1 [ 0  0  0 ]
      2  [ 2   0   ∞ ]        2 [ 0  0  0 ]
      3  [ ∞  -3   0 ]        3 [ 0  0  0 ]

k = 1 (vertex 1 can be an intermediate node):
D1[2,3] = min( D0[2,3], D0[2,1]+D0[1,3] ) = min(∞, 2+5) = 7
D1[3,2] = min( D0[3,2], D0[3,1]+D0[1,2] ) = min(-3, ∞) = -3

D1 =   1   2   3        P =   1  2  3
  1  [ 0   4   5 ]        1 [ 0  0  0 ]
  2  [ 2   0   7 ]        2 [ 0  0  1 ]
  3  [ ∞  -3   0 ]        3 [ 0  0  0 ]

k = 2 (vertices 1, 2 can be intermediate):
D2[1,3] = min( D1[1,3], D1[1,2]+D1[2,3] ) = min(5, 4+7) = 5
D2[3,1] = min( D1[3,1], D1[3,2]+D1[2,1] ) = min(∞, -3+2) = -1

D2 =   1   2   3        P =   1  2  3
  1  [ 0   4   5 ]        1 [ 0  0  0 ]
  2  [ 2   0   7 ]        2 [ 0  0  1 ]
  3  [ -1 -3   0 ]        3 [ 2  0  0 ]

k = 3 (vertices 1, 2, 3 can be intermediate):
D3[1,2] = min( D2[1,2], D2[1,3]+D2[3,2] ) = min(4, 5+(-3)) = 2
D3[2,1] = min( D2[2,1], D2[2,3]+D2[3,1] ) = min(2, 7+(-1)) = 2

D3 =   1   2   3        P =   1  2  3
  1  [ 0   2   5 ]        1 [ 0  3  0 ]
  2  [ 2   0   7 ]        2 [ 0  0  1 ]
  3  [ -1 -3   0 ]        3 [ 2  0  0 ]
Floyd's Algorithm: Using 2 D matrices
Floyd
1. D ← W         // initialize D array to W[ ]
2. P ← 0         // initialize P array to [0]
3. for k ← 1 to n
     // Computing D’ from D
4.   do for i ← 1 to n
5.     do for j ← 1 to n
6.       if (D[i,j] > D[i,k] + D[k,j])
7.         then D’[i,j] ← D[i,k] + D[k,j]
8.              P[i,j] ← k
9.         else D’[i,j] ← D[i,j]
10. Move D’ to D.
Can we use only one D matrix?
D[i,j] depends only on elements in the kth
column and row of the distance matrix.
We will show that the kth row and the kth
column of the distance matrix are unchanged
when Dk is computed
This means D can be calculated in-place
The main diagonal values
Before we show that kth row and column of D
remain unchanged we show that the main
diagonal remains 0
D(k)[j,j] = min{ D(k-1)[j,j], D(k-1)[j,k] + D(k-1)[k,j] }
          = min{ 0, D(k-1)[j,k] + D(k-1)[k,j] }
          = 0
Based on which assumption? (That the graph has no negative cycles, so D(k-1)[j,k] + D(k-1)[k,j] ≥ 0.)
The kth column
kth column of Dk is equal to the kth column of
Dk-1
Intuitively true - a path from i to k will not
become shorter by adding k to the allowed
subset of intermediate vertices
For all i, D(k)[i,k] =
= min{ D(k-1)[i,k], D(k-1)[i,k]+ D(k-1)[k,k] }
= min { D(k-1)[i,k], D(k-1)[i,k]+0 }
= D(k-1)[i,k]
The kth row
kth row of Dk is equal to the kth row of Dk-1
For all j, D(k)[k,j] =
= min{ D(k-1)[k,j], D(k-1)[k,k]+ D(k-1)[k,j] }
= min{ D(k-1)[ k,j ], 0+D(k-1)[k,j ] }
= D(k-1)[ k,j ]
Floyd's Algorithm using a single D
Floyd
1. D ← W         // initialize D array to W[ ]
2. P ← 0         // initialize P array to [0]
3. for k ← 1 to n
4.   do for i ← 1 to n
5.     do for j ← 1 to n
6.       if (D[i,j] > D[i,k] + D[k,j])
7.         then D[i,j] ← D[i,k] + D[k,j]
8.              P[i,j] ← k
Printing intermediate nodes on shortest path from q to r
path(index q, r)
  if (P[q,r] != 0)
    path(q, P[q,r])
    println("v" + P[q,r])
    path(P[q,r], r)
    return
  else return     // no intermediate nodes
Before calling path, check D[q,r] < ∞, and print node q; after the call to path, print node r
Example
(The final distance matrix and P for a six-vertex example are omitted; the values in parentheses were the nonzero P values.)
The call tree for Path(1, 4):
Path(1, 4) → Path(1, 6), print v6, Path(6, 4)
Path(6, 4) → Path(6, 3), print v3, Path(3, 4)
Path(1, 6), Path(6, 3), and Path(3, 4) terminate immediately, since P(1,6) = P(6,3) = P(3,4) = 0.
The intermediate nodes on the shortest path from 1 to 4 are v6, v3.
The shortest path is v1, v6, v3, v4.
CSE408
Maximum Flow
Lecture #30
Maximum Flow
Maximum Flow Problem
The Ford-Fulkerson method
Maximum bipartite matching
Flow networks:
A flow network G = (V,E): a directed graph, where each edge
(u,v) ∈ E has a nonnegative capacity c(u,v) ≥ 0.
If (u,v) ∉ E, we assume that c(u,v) = 0.
Two distinct vertices: a source s and a sink t.
(Diagram omitted: an example network from s to t with edge capacities 16, 13, 12, 10, 4, 9, 7, 14, 20, 4.)
Flow:
G = (V,E): a flow network with capacity function c.
s -- the source and t -- the sink.
A flow in G: a real-valued function f: V×V → R satisfying
the following two properties:
Capacity constraint: For all u,v ∈ V,
we require f(u,v) ≤ c(u,v).
Flow conservation: For all u ∈ V - {s,t}, we require
Σ f(e) over edges e entering u = Σ f(e) over edges e leaving u
Net flow and value of a flow f:
The quantity f(u,v) is called the net flow from vertex
u to vertex v.
The value of a flow is defined as |f| = Σ over v ∈ V of f(s,v):
the total flow from the source to all other vertices,
which equals the total flow from all vertices into the sink.
(Diagram omitted: a flow f in G with value |f| = 19, shown as flow/capacity labels such as 11/16 and 12/12 on each edge.)
Maximum-flow problem:
Given a flow network G with source s and sink t
Find a flow of maximum value from s to t.
How to solve it efficiently?
The Ford-Fulkerson method:
This section presents the Ford-Fulkerson method for solving
the maximum-flow problem. We call it a “method” rather than
an “algorithm” because it encompasses several
implementations with different running times. The Ford-
Fulkerson method depends on three important ideas that
transcend the method and are relevant to many flow
algorithms and problems: residual networks, augmenting
paths, and cuts. These ideas are essential to the important
max-flow min-cut theorem, which characterizes the value of
a maximum flow in terms of cuts of the flow network.
Continue:
FORD-FULKERSON-METHOD(G,s,t)
initialize flow f to 0
while there exists an augmenting path p
do augment flow f along p
return f
Residual networks:
Given a flow network and a flow, the residual
network consists of edges that can admit more net
flow.
G=(V,E) --a flow network with source s and sink t
f: a flow in G.
The amount of additional net flow from u to v
before exceeding the capacity c(u,v) is the residual
capacity of (u,v), given by: cf(u,v)=c(u,v)-f(u,v)
in the other direction: cf(v, u)=c(v, u)+f(u, v).
Example of residual network
(Diagrams omitted: (a) the example network with a flow marked on its edges, e.g. 4/12 and 4/14; (b) the residual network induced by that flow.)
Fact 1:
Let G = (V,E) be a flow network with source s and sink t, and let
f be a flow in G.
Let Gf be the residual network of G induced by f, and let f’ be a
flow in Gf. Then the flow sum f+f’ is a flow in G with
value |f+f’| = |f| + |f’|.
f+f’: flows in the same direction are added;
flows in opposite directions cancel.
Augmenting paths:
Given a flow network G=(V,E) and a flow f, an augmenting
path is a simple path from s to t in the residual network Gf.
Residual capacity of p : the maximum amount of net flow that
we can ship along the edges of an augmenting path p, i.e.,
cf(p)=min{cf(u,v):(u,v) is on p}.
For example, if the residual capacities along p are 2, 3, and 1,
the residual capacity of p is 1.
(Diagram omitted: the residual network from part (b), with an augmenting path shown in bold edges.)
The basic Ford-Fulkerson algorithm:
FORD-FULKERSON(G,s,t)
  for each edge (u,v) ∈ E[G]
    do f[u,v] ← 0
       f[v,u] ← 0
  while there exists a path p from s to t in the residual network Gf
    do cf(p) ← min{cf(u,v): (u,v) is in p}
       for each edge (u,v) in p
         do f[u,v] ← f[u,v] + cf(p)
            f[v,u] ← -f[u,v]
Example:
The execution of the basic Ford-Fulkerson algorithm.
(a)-(d) Successive iterations of the while loop: the left side of
each part shows the residual network Gf with a
shaded augmenting path p. The right side of each part shows
the new flow f that results from adding fp to f. The residual
network in (a) is the input network G. (e) The residual network
at the last while-loop test. It has no augmenting paths, and the
flow f shown in (d) is therefore a maximum flow.
(Diagrams (a)-(e) omitted: each iteration augments the flow along a shaded path in the residual network until no augmenting path remains; the final flow in (d) has value 23.)
Time complexity:
If each c(e) is an integer, then time complexity is O(|E|f*),
where f* is the maximum flow.
Reason: each time the flow is increased by at least one.
This might not be a polynomial time algorithm since f* can be
represented by log (f*) bits. So, the input size might be log(f*).
The Edmonds-Karp algorithm
Find the augmenting path using breadth-first search.
Breadth-first search gives the shortest path for
graphs (Assuming the length of each edge is 1.)
Time complexity of the Edmonds-Karp algorithm is
O(VE²).
The proof is very hard and is not required here.
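The Edmonds-Karp variant can be sketched in Python: breadth-first search finds a shortest augmenting path, the bottleneck residual capacity is pushed along it, and reverse residual edges are updated. The capacity values below are the example network from this lecture (s, v1–v4, t); the dict-of-dicts representation is an implementation assumption.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: Ford-Fulkerson with BFS augmenting paths.
    cap: dict-of-dicts of capacities, cap[u][v] = c(u,v)."""
    # Residual capacities; add reverse edges with capacity 0.
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u in cap:
        for v in cap[u]:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual network
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow                  # no augmenting path: flow is maximum
        path, v = [], t                  # recover the path s -> t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck      # push flow forward
            res[v][u] += bottleneck      # add residual capacity backward
        flow += bottleneck

# The example network from the slides:
cap = {'s': {'v1': 16, 'v2': 13},
       'v1': {'v3': 12, 'v2': 10},
       'v2': {'v1': 4, 'v4': 14},
       'v3': {'v2': 9, 't': 20},
       'v4': {'v3': 7, 't': 4}}
print(max_flow(cap, 's', 't'))  # 23
```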
Maximum bipartite matching:
Bipartite graph: a graph (V,E), where V = L ∪ R, L ∩ R = ∅, and
for every (u,v) ∈ E, u ∈ L and v ∈ R.
Given an undirected graph G = (V,E), a matching is a subset of
edges M ⊆ E such that for all vertices v ∈ V, at most one edge of
M is incident on v. We say that a vertex v ∈ V is matched by
matching M if some edge in M is incident on v; otherwise, v is
unmatched. A maximum matching is a matching of maximum
cardinality, that is, a matching M such that for any matching M’,
we have |M| ≥ |M’|.
(Diagrams omitted: a bipartite graph G = (V,E) with vertex partition V = L ∪ R; (a) a matching with cardinality 2, (b) a maximum matching with cardinality 3.)
Finding a maximum bipartite matching:
We define the corresponding flow network G’ = (V’,E’) for the
bipartite graph G as follows. Let the source s and sink t be
new vertices not in V, and let V’ = V ∪ {s,t}. If the vertex
partition of G is V = L ∪ R, the directed edges of G’ are given by
E’ = {(s,u): u ∈ L} ∪ {(u,v): u ∈ L, v ∈ R, and (u,v) ∈ E} ∪ {(v,t): v ∈ R}.
Finally, we assign unit capacity to each edge in E’.
We will show that a matching in G corresponds directly to a
flow in G’s corresponding flow network G’. We say that a flow
f on a flow network G = (V,E) is integer-valued if f(u,v) is an
integer for all (u,v) ∈ V×V.
(Diagrams omitted: (a) the bipartite graph G = (V,E) with vertex partition V = L ∪ R, with a maximum matching shown by shaded edges; (b) the corresponding flow network. Each edge has unit capacity. Shaded edges have a flow of 1, and all other edges carry no flow.)
Continue:
Lemma.
Let G = (V,E) be a bipartite graph with vertex partition
V = L ∪ R, and let G’ = (V’,E’) be its corresponding flow network. If
M is a matching in G, then there is an integer-valued flow f in
G’ with value |f| = |M|. Conversely, if f is an integer-valued
flow in G’, then there is a matching M in G with
cardinality |M| = |f|.
Reason: the edges incident to s and t ensure this.
Each node in the first column has in-degree 1.
Each node in the second column has out-degree 1.
So each node in the bipartite graph can be involved at most once in the flow.
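Because every edge has unit capacity, the augmenting-path search specializes to a very short routine. The sketch below is Kuhn's augmenting-path algorithm, an equivalent formulation not shown in the slides: each successful search re-routes matched partners exactly as one unit of flow would be pushed in the corresponding network. The example graph is an assumption.

```python
def max_bipartite_matching(adj, n_left, n_right):
    """adj[u] = list of right-vertices adjacent to left-vertex u.
    Each augmenting path found increases the matching by 1,
    mirroring one unit of flow in the corresponding network."""
    match_r = [-1] * n_right          # match_r[v] = left vertex matched to v

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current partner can be re-matched elsewhere
            if match_r[v] == -1 or try_augment(match_r[v], seen):
                match_r[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in range(n_left))

# Hypothetical 3x3 bipartite graph:
adj = [[0, 1], [0], [1, 2]]
print(max_bipartite_matching(adj, 3, 3))  # 3
```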
Example:
(Diagram omitted: a unit-capacity residual network in which red edges are the new residual edges, the old augmenting path with flow 1 is shown in green, and the new augmenting path is shown in bold.)